[Feat] Add LiteLLMEmbeddings - Support SemanticChunking through LiteLLM #154
base: development
Conversation
```python
EmbeddingsRegistry.register(
    "litellm",
    LiteLLMEmbeddings,
    pattern=r"^litellm/|^huggingface/"
```
Hey @Dhan996!
I have a doubt about the pattern for litellm: does LiteLLM load `litellm/<model-name>` properly? Does it host its own models?
LiteLLM doesn't have any models of its own. I put that in as an initial placeholder for when users want to initialize the LiteLLM handling client, but they shouldn't rely on it; they should specify a model explicitly. I will take it out, but would appreciate other ideas. Currently I'm thinking any Hugging Face model should be routed to LiteLLM.
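For concreteness, a minimal sketch of that routing idea, assuming the `EmbeddingsRegistry.register` API shown in this diff; dropping the `litellm/` placeholder and routing only explicit `huggingface/` identifiers is one option, not a settled decision:

```python
# Sketch only: route Hugging Face identifiers to LiteLLM and drop the
# litellm/ placeholder prefix discussed above.
EmbeddingsRegistry.register(
    "litellm",
    LiteLLMEmbeddings,
    pattern=r"^huggingface/",  # matches huggingface/<org>/<model> strings
)
```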
```python
def test_auto_embeddings_litellm(litellm_identifier):
    """Test that the AutoEmbeddings class can get LiteLLM embeddings."""
    embeddings = AutoEmbeddings.get_embeddings(
        litellm_identifier, api_key="your_litellm_api_key"
```
Hey @Dhan996!
Could you have this use an OpenAI API key and an OpenAI model?
Yup, will amend in the next commit.
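One possible shape for the amended test, assuming `AutoEmbeddings.get_embeddings` accepts a LiteLLM-style provider-prefixed model string and that the import path matches the test module in this PR; the model name, skip condition, and the loose assertion are illustrative, not part of the diff:

```python
import os

import pytest

from chonkie.embeddings import AutoEmbeddings


@pytest.mark.skipif(
    "OPENAI_API_KEY" not in os.environ,
    reason="OPENAI_API_KEY not set",
)
def test_auto_embeddings_litellm_openai():
    """AutoEmbeddings should resolve an OpenAI model routed through LiteLLM."""
    embeddings = AutoEmbeddings.get_embeddings(
        "openai/text-embedding-3-small",
        api_key=os.environ["OPENAI_API_KEY"],
    )
    assert embeddings is not None
```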
```python
@pytest.fixture
def embedding_model():
    api_key = os.environ.get("HUGGINGFACE_API_KEY")
    return LiteLLMEmbeddings(api_key=api_key)
```
Hey @Dhan996!
Same as before: could you make the tests use OpenAI as the default, with additional tests for Hugging Face (or any other provider) that run only when the corresponding keys are present in the environment?
Thanks!
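One way to express "OpenAI by default, other providers only when their keys are present", sketched with pytest's skip markers; the model names, env-var names, and the `embed` call are assumptions about this PR's class, not confirmed API:

```python
import os

import pytest

from chonkie.embeddings import LiteLLMEmbeddings


@pytest.fixture
def embedding_model():
    """Default fixture backed by OpenAI through LiteLLM."""
    return LiteLLMEmbeddings(
        model="openai/text-embedding-3-small",
        api_key=os.environ.get("OPENAI_API_KEY"),
    )


@pytest.mark.skipif(
    "HUGGINGFACE_API_KEY" not in os.environ,
    reason="HUGGINGFACE_API_KEY not set",
)
def test_litellm_huggingface_provider():
    """Provider-specific test that only runs when the HF key is available."""
    model = LiteLLMEmbeddings(
        model="huggingface/microsoft/codebert-base",
        api_key=os.environ["HUGGINGFACE_API_KEY"],
    )
    assert model.embed("Hello, world!") is not None
```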
```python
def __init__(
    self,
    model: str = 'huggingface/microsoft/codebert-base',
    input: List[str] = "Hello, my dog is cute",
```
Hey @Dhan996!
I believe this input is just for checking that the embedding response comes through, right? We don't have to offer the user the option to change it as part of the signature; we can keep it fixed inside `__init__`. It would be a good idea to offer as minimal an interface to the user as possible.
Thanks!
Ah, you're right. Will amend.
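A sketch of the slimmer signature discussed above: the probe text used to validate the embedding response becomes an internal constant rather than a constructor argument. The base class and import path are assumptions based on chonkie's other embedding backends:

```python
from typing import Optional

from chonkie.embeddings.base import BaseEmbeddings  # assumed import path


class LiteLLMEmbeddings(BaseEmbeddings):
    """LiteLLM-backed embeddings (signature sketch only)."""

    # Fixed probe text used internally to confirm the provider responds;
    # no longer exposed to the caller.
    _PROBE_TEXT = "Hello, my dog is cute"

    def __init__(
        self,
        model: str = "huggingface/microsoft/codebert-base",
        api_key: Optional[str] = None,
    ) -> None:
        super().__init__()
        self.model = model
        self.api_key = api_key
```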
```python
text = [text]
retries = 5  # Number of retries
wait_time = 10  # Wait time between retries
for i in range(retries):
```
Hey @Dhan996!
Just a doubt, but does LiteLLM do any retries internally?
If it handles them, we can push API retries to its end; otherwise, we should offer retries as a parameter during init.
Thanks!
IIRC, when testing it didn't seem to handle retries, but I will play with it again, and if it does, I'll update.
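If LiteLLM turns out not to retry internally, one option raised above is to make the retry budget configurable at construction time instead of hard-coding it. A sketch, assuming `self.retries` and `self.wait_time` are set in `__init__`; the response indexing follows LiteLLM's OpenAI-compatible embedding response, and the broad `except` should be narrowed in practice:

```python
import time

import litellm


def embed(self, text: str):
    """Embed a single text, retrying transient provider errors."""
    last_error = None
    for _ in range(self.retries):              # retry budget set in __init__
        try:
            response = litellm.embedding(model=self.model, input=[text])
            return response.data[0]["embedding"]
        except Exception as error:              # narrow to litellm's exceptions in practice
            last_error = error
            time.sleep(self.wait_time)          # back-off interval set in __init__
    raise last_error
```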
Hey @Dhan996! Sorry for the delay in posting the review~ I had this in my draft for a while but I forgot to submit it 😅 Just a few tiny doubts and changes but otherwise looks good~ Thanks! 😊
Added LiteLLM support to identify and embed with any Hugging Face embedding model that supports feature extraction.
Registered it in the EmbeddingsRegistry so it works with AutoEmbeddings.
Added tests for the same.
Key things to notice:
- The timeout depends on model size, since the model needs to be loaded onto the local hardware first.
- Context length, dimensions, and similar measures depend on the model.
- token_counter is a callable from litellm, which also needs time to load.
- Currently only Hugging Face models are supported. LiteLLM can support more providers such as Voyage, Mistral, etc., but their API keys must be passed as parameters.
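For reference, a hedged end-to-end sketch of what this PR enables: routing a Hugging Face feature-extraction model through LiteLLM and handing it to SemanticChunker. Import paths and the chunker's `embedding_model` parameter follow chonkie's existing public API as I understand it; treat them as assumptions:

```python
import os

from chonkie import SemanticChunker
from chonkie.embeddings import AutoEmbeddings

# Resolve a Hugging Face model through the LiteLLM backend added in this PR.
embeddings = AutoEmbeddings.get_embeddings(
    "huggingface/microsoft/codebert-base",
    api_key=os.environ.get("HUGGINGFACE_API_KEY"),
)

# Use it for semantic chunking.
chunker = SemanticChunker(embedding_model=embeddings)
chunks = chunker.chunk("Chonkie routes this text through LiteLLM embeddings.")
```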